Support for out-of-source quantizers #3534
base: main
Conversation
Thanks! This is ok, but we'll need to confirm if this might cause some issues downstream
@rolandtannous Can you check if this causes any issues? Thanks
@gemini-code-assist review
Code Review
This pull request introduces a flexible way to apply custom quantization schemes by exposing _prepare_for_qat as a replaceable method on FastLlamaModel and FastModel. This is a great addition for extensibility and a good first step towards the goal described in issue #3521.
However, I've found critical issues in the implementation of the new methods in both unsloth/models/llama.py and unsloth/models/loader.py. The method signatures are incorrect for the decorators used (@staticmethod and @classmethod), which will lead to TypeError exceptions at runtime. My review comments include specific suggestions to fix these issues.
Regarding the naming conventions for quantization functions you mentioned, that's a valid concern for a more general solution. I agree it could be addressed in a follow-up PR to keep this one focused.
Once the signature issues are addressed, this PR will be a solid improvement.
class FastLlamaModel:

    @staticmethod
    def _prepare_for_qat(cls, model, qat_scheme):
The signature of the _prepare_for_qat static method is incorrect. A staticmethod does not receive the class as an implicit first argument. The cls parameter will cause a TypeError at runtime because the method will be called with two arguments (model, qat_scheme), but it expects three. To fix this, the cls parameter should be removed.
Suggested change:
-    def _prepare_for_qat(cls, model, qat_scheme):
+    def _prepare_for_qat(model, qat_scheme):
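To make the failure mode concrete, here is a minimal, standalone sketch of the same pattern (the class name Demo and the string arguments are illustrative, not Unsloth's actual call site):

```python
class Demo:
    @staticmethod
    def _prepare_for_qat(cls, model, qat_scheme):
        # A staticmethod receives no implicit first argument, so `cls` is
        # just an ordinary positional parameter here.
        return model

# The call site supplies only (model, qat_scheme): `cls` absorbs the model,
# `model` absorbs the scheme, and `qat_scheme` is left unfilled.
try:
    Demo._prepare_for_qat("fake_model", "fake_scheme")
except TypeError as exc:
    print(exc)  # e.g. "missing 1 required positional argument: 'qat_scheme'"
```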
    @classmethod
    def _prepare_for_qat(model, qat_scheme):
        model = _prepare_model_for_qat(model, qat_scheme)
        return model
This method is decorated with @classmethod, but its signature (model, qat_scheme) doesn't account for the implicit cls argument that class methods receive, which will cause a TypeError. The first argument is also confusingly named model when it would receive the class object.
Since this method doesn't use the class object, and for consistency with FastLlamaModel, it should be a @staticmethod with the correct signature.
Suggested change:
-    @classmethod
-    def _prepare_for_qat(model, qat_scheme):
-        model = _prepare_model_for_qat(model, qat_scheme)
-        return model
+    @staticmethod
+    def _prepare_for_qat(model, qat_scheme):
+        model = _prepare_model_for_qat(model, qat_scheme)
+        return model
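The @classmethod variant fails in the opposite direction; a standalone sketch (again with illustrative names, not the real call site) shows that the implicitly bound class plus the two explicit arguments is one argument too many:

```python
class Demo:
    @classmethod
    def _prepare_for_qat(model, qat_scheme):
        # `model` actually receives the class object (Demo) because of
        # @classmethod; the real model would land in `qat_scheme`.
        return qat_scheme

# The implicit class plus two explicit arguments makes three in total for a
# two-parameter method.
try:
    Demo._prepare_for_qat("fake_model", "fake_scheme")
except TypeError as exc:
    print(exc)  # e.g. "takes 2 positional arguments but 3 were given"
```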
Code Review
This pull request aims to provide a way for users to customize the quantization process by exposing _prepare_for_qat as a replaceable method. This is a valuable addition for flexibility. However, I've found critical issues in the implementation of the new methods in both unsloth/models/llama.py and unsloth/models/loader.py. The method signatures are incorrect and will lead to TypeError exceptions at runtime. I have provided suggestions to fix these bugs. I've also pointed out a minor style issue regarding an unnecessary blank line.
    def _prepare_for_qat(cls, model, qat_scheme):
        model = _prepare_model_for_qat(model, qat_scheme)
        return model
The signature for the new static method _prepare_for_qat is incorrect. A static method does not receive the class as its first argument. The cls parameter will cause a TypeError at runtime because it will be populated with the model object from the call site, and the model parameter will receive qat_scheme, leaving the qat_scheme argument unfulfilled.
To fix this, you should remove the cls parameter from the method signature.
Suggested change:
-    def _prepare_for_qat(cls, model, qat_scheme):
-        model = _prepare_model_for_qat(model, qat_scheme)
-        return model
+    def _prepare_for_qat(model, qat_scheme):
+        model = _prepare_model_for_qat(model, qat_scheme)
+        return model
    @classmethod
    def _prepare_for_qat(model, qat_scheme):
        model = _prepare_model_for_qat(model, qat_scheme)
        return model
The new _prepare_for_qat method is decorated with @classmethod, but its signature is incorrect. The first argument to a class method is the class itself. Here, it's confusingly named model, and the actual model object is passed as the second argument, which is received by the qat_scheme parameter. This will lead to a TypeError at runtime.
For consistency with the changes in FastLlamaModel and to ensure correct behavior, this should be a staticmethod with the correct signature.
Suggested change:
-    @classmethod
-    def _prepare_for_qat(model, qat_scheme):
-        model = _prepare_model_for_qat(model, qat_scheme)
-        return model
+    @staticmethod
+    def _prepare_for_qat(model, qat_scheme):
+        model = _prepare_model_for_qat(model, qat_scheme)
+        return model
The goal of this PR is to allow any user to easily change how quantization is applied during the fine tuning process.
A longer description of the issue can be found in #3521
By making _prepare_for_qat a staticmethod, it is possible to replace it like so:
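A minimal sketch of one way to do this (the function name my_prepare_for_qat is an illustrative placeholder, and the import path assumes FastLlamaModel as defined in unsloth/models/llama.py, the module touched by this PR):

```python
from unsloth.models.llama import FastLlamaModel  # module touched by this PR

def my_prepare_for_qat(model, qat_scheme):
    # Apply an out-of-source quantizer here (e.g. a Brevitas-based pass)
    # instead of Unsloth's built-in _prepare_model_for_qat.
    ...
    return model

# Because _prepare_for_qat is exposed as a plain staticmethod, it can be
# swapped out without modifying Unsloth's source:
FastLlamaModel._prepare_for_qat = staticmethod(my_prepare_for_qat)
```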
As mentioned in the issue above, even allowing for a custom quantization function (and thus a custom quantizer like Brevitas) does not solve all of the issues, since Unsloth makes a strong assumption about the names of the quantization functions for weights and activations.
This can be patched without necessarily changing anything within Unsloth, but I believe it might be worth thinking about a more general implementation for the naming conventions (either as a task for this PR or a follow-up one, you tell me).
cc @Datta0